Summary of Results

In this section we will present a brief summary of our research results
published after October 1, 1983. The reference numbers in this section are
keyed to the publication list in the last section. Most of our research
results are in the areas of quantization theory and detection theory. We
will begin our summary by discussing our results in quantization theory;
this will be followed by a presentation of our results in detection theory;
and finally, we will mention our results in other areas.

Quantization  is  the  process  by  which  data  is  reduced  to  a  simpler, 

more  coarse  representation  which  is  more  compatible  with  digital  processing. 

Loosely  speaking,  quantization  is  at  the  heart  of  analog  to  digital 

conversion.  It  is  an  area  which  has  increased  in  importance  in  the  last 

few  years  due  to  the  burgeoning  advances  in  digital  technology.  The  typical 

goal  of  quantization  is  to  reduce  data  to  a  simpler  representation  without 

causing  much  distortion;  that  is,  the  output  of  a  quantizer  should  be  close 

to the input, with some appropriate measure of distance. An N-level
k-dimensional vector quantizer is a mapping $Q: \mathbb{R}^k \to \mathbb{R}^k$ which assigns the
input vector $x$ to an output vector $Q(x)$ chosen from a set of N vectors
$\{y_i : y_i \in \mathbb{R}^k, i = 1, \ldots, N\}$. Generally, the quantizer input is modelled as a
random vector $X$ described by a k-variate distribution $F$. A measure of
quantizer performance is the distortion function

    $D(Q,F) = \int d(x, Q(x)) \, dF(x)$,    (1)

where $d: \mathbb{R}^k \times \mathbb{R}^k \to \mathbb{R}$ is an appropriately chosen cost function. An optimal
N-level quantizer $Q$ for the random vector $X$ is one that minimizes (1) over
the class of all N-level quantizers.

There had apparently been a long-standing belief among researchers in
quantization theory that optimal quantizers always exist. This existence
is important from the viewpoint of numerical design algorithms and in
studying convergence properties of sequences of quantizers; also, several
results in quantization theory are predicated on the existence of
optimal quantizers. Several of our earlier efforts were concerned with
establishing conditions guaranteeing the existence of optimal quantizers.

For the case of difference-based distortion functions, i.e.
$d(x,y) = C(\|x-y\|)$, we completely settled the existence question; in [16] we
presented necessary and sufficient conditions for an optimal quantizer
to exist. In [15] our interest was primarily in convergence
properties of sequences of quantizers; however, as a side result, we did
establish a condition guaranteeing existence of optimum quantizers for
non-difference based distortion functions $d(x,y)$. This result provided
a counterexample to a speculation of Gray, Kieffer, and Linde (Information
and Control, May 1980).

We  have  also  been  active  in  establishing  convergence  properties  of 

sequences  of  quantizers.  These  convergence  results  are  important  from 

the  viewpoint  of  numerical  design  algorithms,  and  they  yield  considerable 

insight into the limiting behavior of sequences of quantizers. Suppose
that a sequence of probability measures $P_n$ converges weakly to a probability
measure $P$. Let $Q_n$ be an optimal N-level quantizer for $P_n$. Does the
distortion associated with the quantizer $Q_n$ and the measure $P_n$ converge to
the optimal distortion for quantizing $P$ with N levels? Does the sequence
of optimal quantizers for the $P_n$'s converge to a quantizer $Q$; and if so,
is $Q$ optimal for $P$? Several of our convergence results have been focused
on these two questions. In [3] we considered difference based cost
functions, e.g. $d(x-y)$, and we established results sufficient for affirmative
answers to the above questions. Then in [15] we established conditions
sufficient for affirmative answers for non-difference based cost functions,
e.g. $d(x,y)$. In each of the above works we also considered the above two
questions where $P_n$ represented the empirical measure based on n iid samples
drawn from the measure $P$, and we established conditions sufficient for
almost sure convergence in the above two questions. In all of the above
convergence results we chose to put conditions on the cost function rather
than the distribution; the cost function is easier to control than the
(frequently not exactly known) underlying distribution.

These  convergence  results  for  sequences  of  quantizers  are  fairly 
general  and  they  form  powerful  tools  for  the  study  of  quantization.  For 
example,  one  of  the  more  practical  problems  associated  with  quantizers  is 
the  problem  of  how  to  construct  them.  Most  of  the  algorithms  for  quantizer 
design  involve  successively  improving  a  suboptimal  quantizer,  with  the 
procedure  hopefully  converging  to  an  optimal  quantizer.  The  above  results 
in [3] and [15] directly address this situation. For example, one of the
currently most popular design algorithms for vector quantization is the
so-called "Linde-Buzo-Gray algorithm" (IEEE Transactions on Communications,
January 1980). As a by-product of one of our results in [3], we established
convergence of this algorithm for r-th power distortion measures, i.e.
$d(x,y) = \|x-y\|^r$. This is the first rigorous convergence result for this
algorithm in a reasonably general context. In [13] we used the results in
[3] to investigate convergence properties of quantizer design via successive
improvement upon suboptimal quantizers. If the input distribution $F$ were
not known, we might form an estimate $\hat{F}_n$ based on n observations of the
input signal. As n becomes large, we might expect a reasonable estimate
to converge to the true distribution $F$. Intuitively, then, an optimal
quantizer designed for $\hat{F}_n$, and the resulting distortion, should closely
approximate those of an optimal quantizer for $F$. In [13] we established
properties of an estimator $\hat{F}_n$ so that the above reasoning would be valid.
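
The successive-improvement idea behind such design algorithms can be
sketched in a few lines. What follows is only a minimal illustration of a
generalized Lloyd iteration run on a finite training set (that is, on the
empirical distribution of the samples) with squared-error distortion, the
case r = 2 of the r-th power measures. It is not the procedure analyzed in
[3] or [13]; the function names, the stopping rule and the parameters
`tol` and `max_iter` are assumptions made for this sketch.

```python
def sq_dist(x, y):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def average_distortion(samples, codebook):
    """Empirical distortion: mean squared distance to the nearest output vector."""
    return sum(min(sq_dist(x, y) for y in codebook) for x in samples) / len(samples)


def centroid(cell):
    """Componentwise mean of a non-empty list of vectors."""
    return tuple(sum(c) / len(cell) for c in zip(*cell))


def lloyd_design(samples, codebook, tol=1e-9, max_iter=100):
    """Alternate nearest-neighbor partitioning and centroid updates.

    Illustrative sketch only: stops when the distortion decrease falls
    below `tol` (an assumed stopping rule) or after `max_iter` passes.
    """
    codebook = [tuple(y) for y in codebook]
    prev = cur = average_distortion(samples, codebook)
    for _ in range(max_iter):
        # Partition step: assign each sample to its nearest output vector.
        cells = [[] for _ in codebook]
        for x in samples:
            j = min(range(len(codebook)), key=lambda i: sq_dist(x, codebook[i]))
            cells[j].append(x)
        # Centroid step: an empty cell keeps its previous output vector.
        codebook = [centroid(c) if c else y for c, y in zip(cells, codebook)]
        cur = average_distortion(samples, codebook)
        if prev - cur <= tol:
            break
        prev = cur
    return codebook, cur


if __name__ == "__main__":
    data = [(0.0,), (1.0,), (10.0,), (11.0,)]
    print(lloyd_design(data, [(0.0,), (5.0,)]))
```

Each pass cannot increase the empirical distortion, which is why the
iteration settles; whether the limit is a globally optimal quantizer is
exactly the kind of question the convergence results above address.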

Another  aspect  of  our  work  on  quantization  was  concerned  with 
practical,  numerically-oriented  design  techniques  for  scalar  quantizers. 
Although  the  advantages  of  vector,  or  block,  quantization  are  well  known, 
scalar quantizers are nevertheless in widespread use. In spite of
numerous  elegant  results  in  quantization  theory,  the  actual  practical 
numerical  design  of  scalar  quantizers  is  still  a  challenging  problem. 

In  [8]  we  presented  a  simple  and  straightforward  technique  for  constructing 
minimum  mean  squared  error  symmetric  uniform  scalar  quantizers  for  some 
common  distributions  on  the  data. 

In  the  context  of  scalar  minimum  mean  squared  error  quantization, 

one of the most popular design techniques is the Lloyd-Max algorithm (IRE
Transactions on Information Theory, March 1960 and IEEE Transactions on
Information Theory, March 1982). Unfortunately, two potential problems

arise  with  the  Lloyd-Max  scheme.  The  first  problem  is  how  to  get  a  good 

initial  guess  for  starting  the  iterative  scheme,  and  the  second  problem 

is  how  to  intelligently  update  the  algorithm.  Both  of  these  problems 

were  addressed  in  [1]  and  [18]  for  some  common  distributions  on  the  data. 

Our  modifications  of  the  Lloyd-Max  algorithm  resulted  in  a  very  fast  design 

algorithm  for  scalar  minimum  mean  squared  error  quantization.  For  example, 

we  designed  a  64-level  quantizer  for  a  Gaussian  distribution  with  a  high 

degree of accuracy (the terminating condition for the Lloyd-Max algorithm
was set at $10^{-8}$) in 0.136 seconds of computer time on a CDC Cyber 170/750.
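
For orientation, the unmodified Lloyd-Max iteration for a standard
Gaussian input might be sketched as below. This is the plain textbook
iteration, not the accelerated version of [1] and [18]: the evenly spaced
starting levels, the tolerance `tol` and the function names are
assumptions of this sketch. It uses the closed-form Gaussian cell
centroid $(\varphi(a) - \varphi(b)) / (\Phi(b) - \Phi(a))$ for a cell $(a, b)$.

```python
import math


def pdf(x):
    """Standard normal density (evaluates to 0.0 at +/- infinity)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def cdf(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def lloyd_max_gaussian(n_levels, tol=1e-10, max_iter=10000):
    """Basic Lloyd-Max iteration for minimum mean squared error.

    Sketch only: starts from evenly spaced levels on [-2, 2] (an assumed
    initial guess) and stops when no level moves by more than `tol`.
    """
    levels = [-2.0 + 4.0 * (i + 0.5) / n_levels for i in range(n_levels)]
    for _ in range(max_iter):
        # Thresholds are midpoints between adjacent output levels.
        mids = [0.5 * (u + v) for u, v in zip(levels, levels[1:])]
        edges = [-math.inf] + mids + [math.inf]
        # Each output level moves to the conditional mean of its cell.
        new_levels = [(pdf(a) - pdf(b)) / (cdf(b) - cdf(a))
                      for a, b in zip(edges, edges[1:])]
        shift = max(abs(u - v) for u, v in zip(levels, new_levels))
        levels = new_levels
        if shift < tol:
            break
    return levels


if __name__ == "__main__":
    print(lloyd_max_gaussian(4))
```

The two problems named above show up directly here: the starting levels
are an arbitrary choice, and each pass applies only the simplest possible
update.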

One  of  the  non-mean-squared  error  criteria  that  frequently  appears 
in  the  literature  is  the  criterion  of  mean  absolute  error.  In  [11]  we 


presented  an  efficient  method  for  the  design  of  scalar  minimum  mean 
absolute  error  quantizers.  Our  method  was  based  upon  a  modification  of 
the  Lloyd-Max  algorithm  mentioned  above.  As  an  example,  we  designed  a 
64-level  minimum  mean  absolute  error  quantizer  for  a  standard  normal 
random  variable.  Also  in  [11]  we  gave  a  closed-form  solution  for  the 
minimum  mean  absolute  error  quantizer  for  a  Laplace  random  variable. 

This  closed-form  solution  stands  in  marked  contrast  to  the  laborious 
numerical  procedures  often  encountered  in  quantizer  design  problems. 
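
To illustrate how the mean absolute error criterion changes a
Lloyd-type iteration, the sketch below replaces the conditional mean of
each cell by its conditional median, while the thresholds remain the
midpoints between adjacent levels. It is a generic illustration for a
standard normal input and is not claimed to be the method of [11]; the
starting levels, tolerance and function name are assumptions.

```python
import math
from statistics import NormalDist


def lloyd_mae_gaussian(n_levels, tol=1e-10, max_iter=10000):
    """Lloyd-type iteration for mean absolute error, standard normal input.

    Sketch only: evenly spaced starting levels on [-2, 2] are an assumed
    initial guess.
    """
    nd = NormalDist()  # standard normal: mu = 0, sigma = 1
    levels = [-2.0 + 4.0 * (i + 0.5) / n_levels for i in range(n_levels)]
    for _ in range(max_iter):
        # Nearest-neighbor thresholds under |x - y| are still midpoints.
        mids = [0.5 * (u + v) for u, v in zip(levels, levels[1:])]
        edges = [-math.inf] + mids + [math.inf]
        # The conditional median of a cell (a, b) sits at the probability
        # level halfway between cdf(a) and cdf(b).
        new_levels = [nd.inv_cdf(0.5 * (nd.cdf(a) + nd.cdf(b)))
                      for a, b in zip(edges, edges[1:])]
        shift = max(abs(u - v) for u, v in zip(levels, new_levels))
        levels = new_levels
        if shift < tol:
            break
    return levels


if __name__ == "__main__":
    print(lloyd_mae_gaussian(2))
```

For two levels the iteration lands on the quartiles of the standard
normal distribution, roughly -0.6745 and +0.6745.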

A  popular  way  of  realizing  a  scalar  quantizer  is  via  a  method  known 
as  companding.  A  companding  system  consists  of  an  invertible  function 
$G: \mathbb{R} \to [0,1]$ followed by a uniform N-level quantizer on [0,1], followed
by the inverse function $G^{-1}(\cdot)$. Any arbitrary N-level scalar quantizer
can  be  realized  via  a  companding  system.  This  technique  leads  to  a  closed 
form  solution;  however,  it  is  asymptotic  in  nature.  In  some  cases  the 
accuracy  of  the  companding  method  has  been  overrated.  In  [7]  we  presented 
a  simple  modification  for  improving  the  accuracy  of  the  companding  scheme. 

For the generalized Gaussian density, $f(x) = A \exp[-c|x|^p]$, this modification
resulted  in  a  straightforward  formula  for  constructing  a  better  compressor 
function  G. 
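
The compressor, uniform quantizer, expander structure described above
can be sketched as follows. The compressor used here is the classical
asymptotic choice for a standard Gaussian input and mean squared error,
obtained by integrating the cube root of the density, which gives
$G(x) = \Phi(x/\sqrt{3})$. It is not the modified compressor of [7], and all
function names are assumptions of this sketch.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()
_SCALE = math.sqrt(3.0)  # the cube root of a N(0,1) density is proportional to a N(0,3) density


def compressor(x):
    """G: maps the real line into (0, 1)."""
    return _STD_NORMAL.cdf(x / _SCALE)


def expander(u):
    """Inverse of the compressor on (0, 1)."""
    return _SCALE * _STD_NORMAL.inv_cdf(u)


def uniform_quantize(u, n_levels):
    """Uniform N-level quantizer on [0, 1]: output the midpoint of the cell."""
    i = min(int(u * n_levels), n_levels - 1)
    return (i + 0.5) / n_levels


def compand(x, n_levels):
    """Compress, quantize uniformly, then expand."""
    return expander(uniform_quantize(compressor(x), n_levels))


if __name__ == "__main__":
    print([round(compand(x, 4), 4) for x in (-2.0, -0.1, 0.1, 2.0)])
```

Because the uniform quantizer outputs cell midpoints, which are never 0
or 1, the expander is always evaluated strictly inside (0, 1).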

Another  research  area  in  which  we  have  recently  obtained  results  is 
the  area  of  signal  detection.  The  detection  problem  is  modeled  as  a  test 
between  two  statistical  hypotheses;  we  assume  that  under  the  null  hypothesis 
noise  alone  is  being  observed,  and  under  the  alternate  hypothesis  a  signal 
plus noise is being observed. We considered discrete time detection,
where we assumed that the observation is indexed by a subset of the
integers, e.g. $X_1, X_2, \ldots, X_n$.

In the case of discrete time detection where the noise and the signal
are stationary and the samples are independent, it is well known that the
Neyman-Pearson test has a test statistic which can be expressed as

    $\sum_{i=1}^{n} g(X_i)$,

where $X_i$, $i = 1, \ldots, n$, represent the observations, and $g(\cdot)$ is an
appropriately chosen function. In earlier work we had considered the problem
of constraining the test statistic to be of the above form and letting the
noise samples be "slightly" dependent. We then tried to choose the function
$g(\cdot)$ to best account for the dependency structure of the noise, in the sense
of the asymptotic relative efficiency (or Pitman efficiency) with respect
to any other choice for $g(\cdot)$. In [4] we investigated the problem of how
to choose $g(\cdot)$ when both the signal and the noise were modelled as $\phi$-mixing
random processes, where we also allowed the noise to be dependent on the
signal over a finite window, such as signal dependent noise induced through
reverberation effects. In [5] we considered the problem of approximating
an optimal $g(\cdot)$ by a sequence of Borel measurable functions $\{g_k(\cdot)\}$. We
compared the performance resulting from the approximate nonlinearities to
the optimal performance, and we showed that the loss in performance can be
made arbitrarily small by making $g_k(\cdot)$ appropriately close to $g(\cdot)$. We
allowed a strong mixing dependency structure for the (random) signal and
the noise, and we considered as examples specific forms, e.g. quantizers,
polynomials, for the $g_k(\cdot)$. In [6] and [19] we continued part of this
investigation. Here we were concerned specifically with approximating the
nonlinearity $g(\cdot)$; and our interest was in establishing a lower bound on
the performance, where the lower bound was a function of the $L_2$ distance
between the optimal $g(\cdot)$ and the actual nonlinearity of interest. Notice
that for several reasons one might not use the optimal $g(\cdot)$; for example,
numerical approximations may be employed in solving for $g(\cdot)$, some of the
statistical information necessary for determining $g(\cdot)$ may only be
approximated, or perhaps one introduces another nonlinearity in an attempt
to lend robustness properties to the detection scheme. Our results in
[6] and [19] directly address the question of how the asymptotic
performance is degraded by perturbations in $g(\cdot)$.
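
The detector structure discussed above, a memoryless nonlinearity
applied to each sample, summed and compared with a threshold, can be
sketched as follows. The two nonlinearities shown (the identity and the
hard limiter) are merely familiar examples, not the optimal choices
derived in [4] through [6]; the function names and the sample values are
assumptions of this sketch.

```python
def sign(x):
    """Hard limiter: returns +1, 0 or -1."""
    return (x > 0) - (x < 0)


def detect(observations, g, threshold):
    """Sum g over the observations and compare with a threshold.

    Returns (decision, statistic); the decision is True when the
    statistic exceeds the threshold (signal declared present).
    """
    statistic = sum(g(x) for x in observations)
    return statistic > threshold, statistic


if __name__ == "__main__":
    x = [0.9, -0.2, 1.4, 0.3, -0.1]        # hypothetical observations
    print(detect(x, lambda v: v, 1.0))     # linear nonlinearity
    print(detect(x, sign, 0))              # hard limiter
```

Swapping `g` changes only the per-sample processing, which is why the
effect of perturbing $g(\cdot)$ can be studied in isolation from the rest of
the detector.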

The  relative  efficiency  between  two  detectors  is  a  ratio  of  the  amount 
of  data  required  by  one  detector,  relative  to  another,  to  attain  a  prescribed 
level  of  performance.  Although  this  concept  is  of  fundamental  importance 
in  the  theory  of  signal  detection,  it  has  been  successfully  investigated  in 
only  very  few  special  cases.  As  an  approximation  to  the  relative  efficiency, 
engineers  have  frequently  employed  the  asymptotic  relative  efficiency  (ARE), 
the  limiting  value  of  the  relative  efficiency  (under  suitable  regularity 
conditions) as the sample sizes required by the detectors approach infinity.
The ARE was introduced in the statistical literature, where it is known as
the Pitman efficiency. Usually it can be determined in a fairly straightforward
fashion, and this is due principally to an appeal to the central
limit  theorem.  The  ARE  is  a  limiting  result;  and  in  any  practical 
engineering  situation,  only  a  finite  number  of  samples  can  be  taken  in  the 
context  of  discrete  time  detection.  Thus  it  might  not  always  be  appropriate 
to  approximate  the  relative  efficiency  with  the  ARE.  In  [10]  we  considered 
the  discrete  time  detection  of  a  known  time  varying  signal  in  additive 
noise,  where  the  noise  sequence  is  assumed  to  be  a  sequence  of  iid  random 
variables;  and  we  studied  the  relative  efficiency  of  the  sign  detector,  a 
popular  nonparametric  detector,  and  the  correlation  detector,  which  is 
Neyman-Pearson  optimal  in  the  case  when  the  noise  is  Gaussian.  In  this 
work  [10]  we  presented  results  illustrating  the  convergence  of  relative 


efficiencies  for  both  Gaussian  noise  and  Laplace  noise.  Some  examples  were 
given  where  the  relative  efficiencies  did  not  quickly  converge  to  the  ARE. 

In  this  work  [10]  we  also  presented  bounds  on  the  relative  efficiency  in 
the  case  where  the  (deterministic)  signal  was  unknown;  for  example,  it 
might only be known that at the i-th sample, $\bar{s}_i - \epsilon < s_i < \bar{s}_i + \epsilon$, where $s_i$
represents the signal, and $\bar{s}_i$ and $\epsilon$ are known.
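
As a point of reference for the limiting values, the classical
expression for the ARE of the sign detector relative to the linear
(correlation) detector, for a constant signal in iid noise with a
symmetric density $f$ and variance $\sigma^2$, is $4 \sigma^2 f(0)^2$. The sketch below
evaluates it for Gaussian and Laplace noise. This is the standard textbook
formula rather than a result of [10], and it says nothing about how quickly
finite-sample relative efficiencies approach these limits; the function
name and parameters are assumptions.

```python
import math


def are_sign_vs_linear(noise_variance, density_at_zero):
    """Classical Pitman ARE of the sign detector relative to the linear detector."""
    return 4.0 * noise_variance * density_at_zero ** 2


# Gaussian noise with unit variance: f(0) = 1 / sqrt(2*pi).
gaussian_are = are_sign_vs_linear(1.0, 1.0 / math.sqrt(2.0 * math.pi))

# Laplace noise with scale b: variance 2*b**2 and f(0) = 1 / (2*b).
b = 1.5  # hypothetical scale parameter; the ARE does not depend on it
laplace_are = are_sign_vs_linear(2.0 * b * b, 1.0 / (2.0 * b))

if __name__ == "__main__":
    print(gaussian_are, laplace_are)
```

The Gaussian value is $2/\pi$ (about 0.64) and the Laplace value is 2, so
in the limit the sign detector loses to the correlation detector in
Gaussian noise and beats it in Laplace noise.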

Kassam and Thomas (IEEE Transactions on Information Theory, July 1975)
considered the discrete time detection of a constant signal in m-dependent
noise. Their scheme consisted of summing the first n samples, skipping
(i.e.  throwing  away)  the  next  m,  summing  the  next  n,  skipping  the  next  m, 
etc.  They  then  applied  the  classical  sign  detector  to  the  sequence  of  sums, 
and  they  concluded  that,  asymptotically,  n  should  be  chosen  as  large  as 
possible  to  maximize  performance.  They  then  concluded  that  this  method 
could  be  extended  to  noise  sequences  that  were  strong  mixing,  and  that 
results  under  an  m-dependent  assumption  yielded  very  close  approximations. 

In  [17]  we  presented  a  rigorous  analysis  of  this  conjectural  conclusion, 
and  we  showed  that  a  considerably  more  careful  analysis  was  necessary  for 
the  case  of  strong  mixing  noise.  We  showed  how  such  a  nonparametric  detector 
may  be  designed.  We  established  an  upper  bound  on  the  asymptotic  performance 
and  we  specified  the  form  of  a  detector  which  achieves  this  upper  bound.  In 
[17]  we  also  considered  the  design  of  the  detector  under  a  finite  sample 
(i.e.  non-asymptotic)  criterion,  and  we  showed  that  there  can  be  a  marked 
difference  in  the  detector  designs  resulting  from  the  two  criteria  (i.e. 
asymptotic  and  non-asymptotic). 
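
The sum-and-skip preprocessing described above is easy to state in
code. The sketch below forms the block sums and then counts the positive
sums, which is the statistic a classical sign detector would compare with
a threshold. It only illustrates the data flow; it does not reproduce the
detector designs of [17], and the function names are assumptions.

```python
def block_sums(samples, n, m):
    """Sum n samples, discard the next m, and repeat.

    A trailing group with fewer than n samples is dropped.
    """
    sums = []
    start = 0
    while start + n <= len(samples):
        sums.append(sum(samples[start:start + n]))
        start += n + m
    return sums


def sign_statistic(values):
    """Number of strictly positive values (the sign detector statistic)."""
    return sum(1 for v in values if v > 0)


if __name__ == "__main__":
    data = list(range(10))
    sums = block_sums(data, n=3, m=2)
    print(sums, sign_statistic(sums))
```

With m-dependent noise the discarded gaps make the block sums mutually
independent, which is what allows the classical sign detector to be
applied to them.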

Consider  detecting  a  deterministic  time  varying  signal  in  additive 
noise  based  on  a  fixed  (finite)  number  of  observations.  If  the  noise 
process  is  mutually  independent,  then  the  solution  of  the  problem  is  easily 


formulated  in  terms  of  the  Neyman-Pearson  criterion  in  which  the  detection 
probability  is  maximized  for  a  constrained  false  alarm  probability.  The 
resultant  detector  is  then  implemented  by  comparing  the  output  of  a 
transformation  of  the  data  to  a  threshold,  the  transformation  being 
obtainable  from  the  univariate  noise  distributions.  However,  with  today's 
high  sampling  rates,  an  assumption  of  independent  samples  is  becoming 
increasingly  inappropriate.  Although  the  Neyman-Pearson  criterion  can 
still  be  applied  in  theory,  the  presence  of  dependency  greatly  compromises 
its  application.  Lack  of  knowledge  of  the  higher  order  noise  distributions 
results  in  the  inability  to  specify  completely  the  required  transformation 
(the  likelihood  ratio).  We  therefore  have  a  situation  in  which  the 
problem  is  tractable  under  an  independence  assumption  but  it  should  most 
properly  be  approached  under  the  dependence  assumption.  Often  in  the  past, 
whatever  dependency  has  been  present  has  been  ignored  in  order  to  obtain 
tractable  results.  This  has  led  to  variations  in  the  nominal  values  of 
the  detection  probability  and  the  false  alarm  probability  because  of  the 
residual  dependency.  If  the  dependency  was  "weak",  then  one  would  hope 
that  these  variations  would  be  acceptably  small.  In  [2]  and  [14]  we 
investigated  quantitative  conditions  which  allowed  determining  when  the 
dependency  can  be  ignored,  and  we  presented  a  result  which  allowed  bounding 
the  variations  in  the  detection  probability  and  the  false  alarm  probability 
induced  by  ignoring  the  dependency. 

Consider  once  again  the  discrete  time  detection  of  a  signal  in 
additive  noise.  Under  a  variety  of  fidelity  criteria,  an  optimal  detector 
consists  of  mapping  the  data  into  the  real  numbers  via  the  likelihood  ratio 
and  then  comparing  the  result  to  an  appropriate  threshold  (determined  by 
the  fidelity  criterion).  Clearly,  the  likelihood  ratio  represents  the 


actual  "processing"  of  the  data.  Assume  that  the  noise  distribution  is 
changed  from  its  nominal  model.  When  does  the  resulting  likelihood  ratio 
(i.e.  the  data  processor)  change?  In  [12]  we  considered  this  situation 
and  we  completely  characterized  the  situation  where  the  noise  distribution 
can  change  but  the  likelihood  ratio  remains  unchanged.  In  particular,  we 
produced  examples  where  the  noise  distribution  can  change  dramatically, 
but  the  likelihood  ratio  remains  the  same. 

In [9] we investigated an existing method (Delp and Mitchell, IEEE
Transactions on Communications, September 1979) for image compression known
as  block  truncation  coding.  The  basic  block  truncation  coding  approach 
employs  a  two  level  quantizer  whose  output  levels  are  obtained  through 
matching  the  first  two  sample  moments  of  the  data  before  and  after 
quantization.  We  generalized  this  basic  block  truncation  coding  approach 
by  using  two  level  quantizers  which  preserve  higher  order  moments.  This 
generalization  offered  the  potential  for  improved  performance.  Some 
examples  were  given  to  illustrate  the  improvement  in  image  quality. 
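
The basic two-moment scheme can be sketched for a single block as
follows. Pixels at or above the block mean are mapped to a high level and
the rest to a low level, with the two levels chosen so that the sample
mean and sample variance of the block are preserved. This shows only the
basic approach; the higher order moment generalization of [9] is not
shown, and the function names are assumptions of this sketch.

```python
import math


def btc_block(pixels):
    """Two-level quantizer preserving the block's sample mean and variance.

    Returns (bitmap, low, high), where bitmap[i] is True for pixels at or
    above the block mean.
    """
    m = len(pixels)
    mean = sum(pixels) / m
    var = sum((p - mean) ** 2 for p in pixels) / m
    bitmap = [p >= mean for p in pixels]
    q = sum(bitmap)
    if q in (0, m):
        # Flat block (or rounding left one side empty): a single level suffices.
        return bitmap, mean, mean
    sigma = math.sqrt(var)
    low = mean - sigma * math.sqrt(q / (m - q))
    high = mean + sigma * math.sqrt((m - q) / q)
    return bitmap, low, high


def btc_reconstruct(bitmap, low, high):
    """Rebuild the block from the bitmap and the two output levels."""
    return [high if bit else low for bit in bitmap]


if __name__ == "__main__":
    block = [2, 4, 6, 8]
    print(btc_reconstruct(*btc_block(block)))
```

Only the bitmap and the two levels need to be stored per block, which is
where the compression comes from.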

Finally,  in  [20]  we  pointed  out  that  even  for  bounded  random  variables 
the  conditional  expectation  does  not  always  yield  a  minimum  mean  squared 
error  estimate.  That  is,  we  constructed  two  bounded  random  variables  X 
and Y and a function $f: \mathbb{R} \to \mathbb{R}$ such that $Y = f(X)$ pointwise on the underlying
probability space but $E\{(Y - E\{Y \mid X\})^2\}$ > 10^°.


Research  in  Progress 

Our  research  is  progressing  very  well  in  several  directions.  In 


this  section  we  will  briefly  describe  the  problems  we  are  currently 
investigating. 

The  newest  research  direction  we  are  pursuing  and  the  one  in  which 
most  of  our  effort  is  currently  being  expended  deals  with  several  aspects 
of  conditional  expectations.  Naturally,  this  is  closely  aligned  with 
mean squared error estimation. For example, let $Y$ denote a second order
random variable of interest, and let $X_1, \ldots, X_k$ denote our data. One might
decide to estimate $Y$ by using $\hat{Y} = E\{Y \mid X_1, \ldots, X_k\}$. However, in a practical
situation, the data is better modeled as $Q_1(X_1), Q_2(X_2), \ldots, Q_k(X_k)$, the
result of an analog to digital conversion of the observations; for example,
they might be stored in a digital computer.
Thus, perhaps we should use as our estimate of $Y$ the quantity
$\tilde{Y} = E\{Y \mid Q_1(X_1), \ldots, Q_k(X_k)\}$. How does $E\{(Y-\tilde{Y})^2\}$ compare with $E\{(Y-\hat{Y})^2\}$? How
should the quantizers $\{Q_i\}$ be designed to make $E\{(Y-\tilde{Y})^2\}$ close to $E\{(Y-\hat{Y})^2\}$?
We are presently investigating this situation.

Another aspect of our investigations deals with the continuity of
$\sigma$-algebras generated by random processes. Let $X(t)$ denote a random
process, and let $F_t = \sigma\{X(s), s \le t\}$. Define

    $F_{t+} = \bigcap_{s > t} F_s$,

    $F_{t-} = \sigma\{\bigcup_{s < t} F_s\}$.

We say that the flow $\{F_t\}$ is continuous at t if $F_{t-} = F_t = F_{t+}$. In
numerous popular works on estimation theory (e.g. Gihman and Skorohod,
The Theory of Stochastic Processes, Vols. I, II, III and Liptser and
Shiryayev, Statistics of Random Processes I and II), it is simply assumed
that the flow of $\sigma$-algebras is continuous, and this assumption plays a
fundamental role in many of the results. How restrictive is this
assumption? We are currently investigating properties of $X(t)$ that are
consistent with the continuity of $F_t$. Our present results indicate that
there is little if any relation between the sample path regularity of
$X(t)$ and the continuity of $F_t$. For example, we can exhibit random processes
with real analytic sample functions and discontinuous $\sigma$-algebras, and we
can exhibit random processes with non-Lebesgue measurable sample functions
and continuous $\sigma$-algebras. As implied earlier, the results of this
investigation  are  pertinent  to  the  applicability  of  the  results  in  several 
popular  texts.  Also,  these  results  are  fundamental  to  relating  estimates 
based  on  observing  a  random  process  over  an  interval  to  estimates  based  on 
observing  a  random  process  at  only  a  finite  set  of  times.  For  example, 
how does

    $E\{Y \mid X(s), s \in [a,b]\}$

compare to

    $E\{Y \mid X(s_i), i = 1, \ldots, n\}$,

where the $s_i \in [a,b]$? Can we make them close in some sense? How should the
observation times $s_i$ be chosen?

One  of  the  more  practical  problems  we  are  investigating  is  data 
reduction  for  image  processing.  Consider  an  image  composed  of  pixels  taking 
on one of several gray levels. For example, if there are $2^b$ gray levels,
then  each  pixel  can  be  represented  by  using  b  bits,  and  each  image  would 
therefore  be  representable  by  a  certain  number  of  bits.  We  are  presently 
investigating  a  method  for  reducing  the  number  of  bits  used  to  represent 


an  image  without  altering  the  image  very  much.  Our  results  in  this  area 
are  still  in  an  embryonic  stage.  We  hope  to  characterize  a  class  of  images 
and a method of data reduction so that the data can be reduced by a factor,
say k:1, and at the same time the image will undergo only negligible
alteration. 

Our  current  work  in  the  theory  of  signal  detection  is  moving  away  from 
asymptotic  results  and  more  toward  detection  based  on  a  finite  number  of 
observations.  Two  main  directions  in  our  investigation  of  signal  detection 
are  concerned  with  properties  of  the  relative  efficiency  between  detectors 
and  with  consequences  of  robustness  in  detection  schemes.  Engineers  have 
often  used  the  asymptotic  relative  efficiency  between  two  detectors  as  a 
way  of  comparing  the  detectors.  However,  in  a  practical  situation,  the 
quantity of concern is actually the relative efficiency (based on a finite
number  of  samples).  As  mentioned  in  the  previous  section,  we  have  already 
achieved  some  preliminary  results  in  this  area.  Another  aspect  of  signal 
detection  that  we  are  currently  investigating  is  concerned  with  the  concept 
of  robust  signal  detection.  A  saddle  point  approach  to  robust  hypothesis 
testing  was  established  in  the  1960’s  by  Peter  Huber.  In  the  last  few 
years  several  investigators  in  signal  processing  have  applied  these 
results  to  some  situations  in  signal  detection.  However,  there  appears 
to be an inadequate degree of understanding concerning the performance of
these robust detection schemes in particular situations. For example,
consider a simple hypothesis test using a nominal distribution. Now
consider testing composite hypotheses by letting the underlying noise
distribution be allowed to vary from the nominal distribution within
appropriately defined neighborhoods (e.g. Prohorov distance, Kolmogorov
distance, Lévy distance, etc.), and consider a robust detector designed


for  this  second  situation.  Naturally,  this  has  the  pleasing  attribute  of 
being  robust,  but  the  question  remains  as  to  how  the  performance  is 
affected by using the robust detector. For example, assume that the
robust detector is used for the nominal distribution. How much worse
will the performance be than if the Neyman-Pearson detector had been
used? Our present investigations are addressing this matter, and we have
found some cases where the robust detector for the nominal distribution
gave a detection probability less than half of that given by a Neyman-Pearson
detector (where both detectors had the same false alarm
probability). 

The above summary describes our ongoing research. In the near future
we  hope  to  focus  more  on  data  processing  schemes  designed  under  imperfect 
or  erroneous  assumptions. 